Types of Trees in Vancouver's Neighbourhoods

By Elisabeth Cadman

Vancouver is a lush, green city with trees lining the streets in all its neighbourhoods. But do some neighbourhoods have far more trees than others? And which varieties of trees can be found in each neighbourhood? I want to know what kinds of trees I'm likely to find when I'm walking through Vancouver, so let's see what we can find out.

[Image: trees.jpeg] Source

Dataset Summary Investigation

The data used in this analysis come from the City of Vancouver's Street Trees dataset. The version used here was prepared by the developers and instructors of the UBC Extended Learning Data Visualization course. The dataset can be found here.

In [1]:
import altair as alt
import numpy as np
import pandas as pd

# Raise Altair's default 5,000-row limit so larger frames can be plotted
alt.data_transformers.enable('default', max_rows=1000000)
In [2]:
url = 'https://raw.githubusercontent.com/UBC-MDS/data_viz_wrangled/main/data/Trees_data_sets/small_unique_vancouver.csv'
trees_df = pd.read_csv(url, parse_dates=['date_planted'])
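trees_df.info  # no parentheses, so this displays the bound method (with the frame's repr); trees_df.info() would print the column summary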
Out[2]:
<bound method DataFrame.info of       Unnamed: 0      std_street       on_street   species_name  \
0          10747       W 20TH AV       W 20TH AV    PLATANOIDES   
1          12573       W 18TH AV       W 18TH AV     CALLERYANA   
2          29676         ROSS ST         ROSS ST          NIGRA   
3           8856        DOMAN ST        DOMAN ST      AMERICANA   
4          21098  EAST BOULEVARD  EAST BOULEVARD  HIPPOCASTANUM   
...          ...             ...             ...            ...   
4995        6132       E 53RD AV       E 53RD AV      SERRULATA   
4996        5642       E 32ND AV       E 32ND AV             XX   
4997        8777       DAWSON ST       DAWSON ST     TULIPIFERA   
4998       23489       E 13TH AV       E 13TH AV    INVOLUCRATA   
4999        7450     CULLODEN ST     CULLODEN ST      CAMPESTRE   

            neighbourhood_name date_planted  diameter street_side_name  \
0                   Riley Park   2000-02-23      28.5             EVEN   
1                Arbutus-Ridge   1992-02-04       6.0              ODD   
2                       Sunset          NaT      12.0              ODD   
3                    Killarney   1999-11-12      11.0             EVEN   
4                  Shaughnessy          NaT      15.5              ODD   
...                        ...          ...       ...              ...   
4995       Victoria-Fraserview          NaT      17.0             EVEN   
4996  Kensington-Cedar Cottage   2014-01-14       3.0             EVEN   
4997                 Killarney   2002-04-15       3.5             EVEN   
4998            Mount Pleasant   2003-12-02       5.5             EVEN   
4999  Kensington-Cedar Cottage          NaT       3.0              ODD   

        genus_name assigned  ...  plant_area curb tree_id  \
0             ACER        N  ...          15    Y   21421   
1            PYRUS        N  ...           7    Y  129645   
2            PINUS        N  ...           7    Y  154675   
3         FRAXINUS        N  ...           7    Y  180803   
4         AESCULUS        Y  ...           N    Y   74364   
...            ...      ...  ...         ...  ...     ...   
4995        PRUNUS        N  ...           9    Y   47059   
4996        CORNUS        N  ...          10    N  247874   
4997  LIRIODENDRON        N  ...           7    Y  192642   
4998       DAVIDIA        N  ...           5    Y  202500   
4999          ACER        N  ...           8    Y  259433   

                      common_name height_range_id  on_street_block  \
0                    NORWAY MAPLE               4                0   
1                CHANTICLEER PEAR               2             2300   
2                   AUSTRIAN PINE               4             7800   
3             AUTUMN APPLAUSE ASH               4             6900   
4            COMMON HORSECHESTNUT               4             5200   
...                           ...             ...              ...   
4995     KWANZAN FLOWERING CHERRY               2             2200   
4996  EDDIES WHITE WONDER DOGWOOD               1             1700   
4997             ARNOLD TULIPTREE               2             6500   
4998    DOVE OR HANDKERCHIEF TREE               1              300   
4999              RED SHINE MAPLE               1             4500   

             cultivar_name root_barrier   latitude   longitude  
0                      NaN            N  49.252711 -123.106323  
1              CHANTICLEER            N  49.256350 -123.158709  
2                      NaN            N  49.213486 -123.083254  
3          AUTUMN APPLAUSE            N  49.220839 -123.036721  
4                      NaN            N  49.238514 -123.154958  
...                    ...          ...        ...         ...  
4995               KWANZAN            N  49.221161 -123.061023  
4996  EDDIE'S WHITE WONDER            N  49.241544 -123.070644  
4997                ARNOLD            N  49.224511 -123.048723  
4998                   NaN            Y  49.259208 -123.096905  
4999             RED SHINE            N  49.243772 -123.078967  

[5000 rows x 21 columns]>
In [3]:
trees_df.head()
Out[3]:
   Unnamed: 0      std_street       on_street   species_name \
0       10747       W 20TH AV       W 20TH AV    PLATANOIDES
1       12573       W 18TH AV       W 18TH AV     CALLERYANA
2       29676         ROSS ST         ROSS ST          NIGRA
3        8856        DOMAN ST        DOMAN ST      AMERICANA
4       21098  EAST BOULEVARD  EAST BOULEVARD  HIPPOCASTANUM

   neighbourhood_name  date_planted  diameter  street_side_name \
0          Riley Park    2000-02-23      28.5              EVEN
1       Arbutus-Ridge    1992-02-04       6.0               ODD
2              Sunset           NaT      12.0               ODD
3           Killarney    1999-11-12      11.0              EVEN
4         Shaughnessy           NaT      15.5               ODD

   genus_name  assigned  ...  plant_area  curb  tree_id \
0        ACER         N  ...          15     Y    21421
1       PYRUS         N  ...           7     Y   129645
2       PINUS         N  ...           7     Y   154675
3    FRAXINUS         N  ...           7     Y   180803
4    AESCULUS         Y  ...           N     Y    74364

            common_name  height_range_id  on_street_block \
0          NORWAY MAPLE                4                0
1      CHANTICLEER PEAR                2             2300
2         AUSTRIAN PINE                4             7800
3   AUTUMN APPLAUSE ASH                4             6900
4  COMMON HORSECHESTNUT                4             5200

     cultivar_name  root_barrier   latitude    longitude
0              NaN             N  49.252711  -123.106323
1      CHANTICLEER             N  49.256350  -123.158709
2              NaN             N  49.213486  -123.083254
3  AUTUMN APPLAUSE             N  49.220839  -123.036721
4              NaN             N  49.238514  -123.154958

5 rows × 21 columns

In [4]:
trees_df.describe()
Out[4]:
         Unnamed: 0     diameter  civic_number        tree_id \
count   5000.000000  5000.000000   5000.000000    5000.000000
mean   14861.920400    12.340888   2975.707600  128682.584600
std     8680.023278     9.266600   2078.580429   75412.260406
min        2.000000     0.000000      2.000000      36.000000
25%     7192.750000     4.000000   1300.500000   61321.500000
50%    14870.000000    10.000000   2639.000000  130130.500000
75%    22366.750000    18.000000   4123.000000  191332.000000
max    29992.000000    71.000000   9113.000000  270750.000000

       height_range_id  on_street_block     latitude    longitude
count       5000.00000      5000.000000  5000.000000  5000.000000
mean           2.73440      2960.227000    49.247349  -123.107128
std            1.56957      2086.861052     0.021251     0.049137
min            0.00000         0.000000    49.202783  -123.220560
25%            2.00000      1300.000000    49.230152  -123.144178
50%            2.00000      2600.000000    49.247981  -123.105861
75%            4.00000      4100.000000    49.263275  -123.063484
max            9.00000      9100.000000    49.293930  -123.023311
In [5]:
trees_df.describe(exclude='number')
/var/folders/xt/gs28_l253b16nn1nj03rtwxw0000gn/T/ipykernel_24694/1239690617.py:1: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
  trees_df.describe(exclude='number')
Out[5]:
        std_street  on_street  species_name   neighbourhood_name \
count         5000       5000          5000                 5000
unique         603        607           171                   22
top      W 13TH AV  CAMBIE ST     SERRULATA  Renfrew-Collingwood
freq            52         49           463                  384
first          NaN        NaN           NaN                  NaN
last           NaN        NaN           NaN                  NaN

               date_planted  street_side_name  genus_name  assigned \
count                  2363              5000        5000      5000
unique                 1599                 4          67         2
top     2004-02-16 00:00:00               ODD        ACER         N
freq                      7              2554        1218      4564
first   1989-10-31 00:00:00               NaN         NaN       NaN
last    2019-05-07 00:00:00               NaN         NaN       NaN

        plant_area  curb               common_name  cultivar_name  root_barrier
count         4950  5000                      5000           2658          5000
unique          38     2                       361            176             2
top             10     Y  KWANZAN FLOWERING CHERRY        KWANZAN             N
freq           736  4593                       383            383          4679
first          NaN   NaN                       NaN            NaN           NaN
last           NaN   NaN                       NaN            NaN           NaN
In [6]:
genus_count = trees_df.genus_name.value_counts().rename_axis('genus_name').reset_index(name='count')
genus_count
Out[6]:
genus_name count
0 ACER 1218
1 PRUNUS 1050
2 FRAXINUS 238
3 TILIA 238
4 QUERCUS 218
... ... ...
62 NOTHOFAGUS 1
63 ARAUCARIA 1
64 SOPHORA 1
65 PTELEA 1
66 CLADRASTIS 1

67 rows × 2 columns

Now I can see that the dataframe includes 22 different neighbourhoods and 67 different genera, which are further broken down into species and common varieties. I'm going to look at which genera of trees are in which neighbourhoods.
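The same counts can be read directly with `nunique` instead of from the `describe` output. This is a minimal sketch on a tiny stand-in frame: `sample_df` and its rows are made up for illustration, and in the notebook the call would be made on `trees_df`.

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'neighbourhood_name': ['Sunset', 'Sunset', 'Killarney', 'Downtown'],
    'genus_name': ['ACER', 'PRUNUS', 'ACER', 'FAGUS'],
})

# Number of distinct values in each column of interest.
distinct_counts = sample_df[['neighbourhood_name', 'genus_name']].nunique()
print(distinct_counts)
```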

Which Trees Are Present in Each Neighbourhood?

In [7]:
# base plot for genus counts
genus_count_plot = alt.Chart(genus_count).mark_bar(color='green').encode(
    alt.X('genus_name:N', sort='-y', title='Genus'),
    alt.Y('count:Q', title='Number of Trees'),
    tooltip=[alt.Tooltip('count:Q', title='Quantity')]
).properties(title='Quantity of Trees in Each Genus', height=250)
genus_count_plot
Out[7]:
In [8]:
# base plot for tree quantities in neighbourhoods
neighbourhood_trees_bar = alt.Chart(trees_df).mark_bar(color='darkblue').encode(
    alt.X('count()', title='Number of Trees'),
    alt.Y('neighbourhood_name', title='Neighbourhood'),
    tooltip=[alt.Tooltip('count()', title='Quantity')]
).properties(title={
    'text': 'Number of Trees in Each Neighbourhood',
    'subtitle': 'Click on a bar to select the neighbourhood. Double-click to clear.'
})
neighbourhood_trees_bar
Out[8]:
In [9]:
# add click selection to neighbourhood plot
select_neighbourhood_click = alt.selection_single(encodings=['y'], on='click')

select_neighbourhood = (neighbourhood_trees_bar.encode(
    opacity=alt.condition(select_neighbourhood_click, alt.value(1), alt.value(0.2)))
.add_selection(select_neighbourhood_click)).properties(height=300, width=200)

select_neighbourhood
Out[9]:
In [10]:
# genus counts filtered by the clicked neighbourhood, placed beside the neighbourhood bar chart
genus_per_neighbourhood = alt.Chart(trees_df).transform_filter(select_neighbourhood_click).mark_bar(color='green').encode(
    alt.X('genus_name', sort='-y', title='Genus'),
    alt.Y('count:Q', title='Number of Trees'),
    tooltip=[alt.Tooltip('count:Q', title='Quantity')]
).transform_aggregate(count='count()',groupby=['genus_name']
).transform_window(rank='rank(count())',sort=[alt.SortField('count()', order='descending')]
).add_selection(select_neighbourhood_click).properties(title='Quantity of Trees in Each Genus')

combo_plot = select_neighbourhood | genus_per_neighbourhood
combo_plot
Out[10]:

This plot answers my first question of which trees are present in which neighbourhoods. Now I'm curious about the size of trees in each neighbourhood.
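The interactive plot can be cross-checked with a plain table of the most common genera per neighbourhood. A rough sketch of one way to do that, assuming a small made-up frame (`sample_df` is hypothetical; the column names match `trees_df`):

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'neighbourhood_name': ['Sunset'] * 6 + ['Downtown'] * 3,
    'genus_name': ['ACER', 'ACER', 'ACER', 'PRUNUS', 'PRUNUS', 'TILIA',
                   'FAGUS', 'FAGUS', 'ACER'],
})

# Count trees per (neighbourhood, genus) pair, largest counts first within each neighbourhood.
genus_counts = (
    sample_df.groupby(['neighbourhood_name', 'genus_name'])
    .size()
    .reset_index(name='count')
    .sort_values(['neighbourhood_name', 'count'], ascending=[True, False])
)

# Keep the two most abundant genera in each neighbourhood.
top_two = genus_counts.groupby('neighbourhood_name').head(2)
print(top_two)
```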

How Big Are the Trees in Each Neighbourhood?

In [11]:
# exploring size distribution in tree genera
genus_diameter_boxplot = alt.Chart(trees_df).mark_boxplot().encode(
    alt.X('diameter'),
    alt.Y('genus_name'))
genus_diameter_boxplot
Out[11]:
In [12]:
# finding max diameter tree in each neighbourhood
# select 'diameter' before aggregating so non-numeric columns are not passed to max()
neighbourhood_max = trees_df.groupby('neighbourhood_name')['diameter'].max().reset_index()
neighbourhood_max
Out[12]:
neighbourhood_name diameter
0 Arbutus-Ridge 48.0
1 Downtown 28.0
2 Dunbar-Southlands 49.5
3 Fairview 46.0
4 Grandview-Woodland 51.0
5 Hastings-Sunrise 41.0
6 Kensington-Cedar Cottage 41.5
7 Kerrisdale 57.0
8 Killarney 40.0
9 Kitsilano 71.0
10 Marpole 56.0
11 Mount Pleasant 37.5
12 Oakridge 38.0
13 Renfrew-Collingwood 38.5
14 Riley Park 40.0
15 Shaughnessy 71.0
16 South Cambie 40.0
17 Strathcona 34.0
18 Sunset 45.0
19 Victoria-Fraserview 40.0
20 West End 41.5
21 West Point Grey 46.0
In [18]:
neighbourhood_max_plot = alt.Chart(neighbourhood_max).mark_circle(color='darkblue', size=75).encode(
    alt.X('neighbourhood_name', title='Neighbourhood'),
    alt.Y('diameter', title='Diameter'),
    tooltip=('neighbourhood_name', 'diameter'))
neighbourhood_max_plot
Out[18]:
In [14]:
# mean diameter per neighbourhood; select 'diameter' first so only that column is averaged
neighbourhood_mean = trees_df.groupby('neighbourhood_name')['diameter'].mean().reset_index()
neighbourhood_mean
Out[14]:
neighbourhood_name diameter
0 Arbutus-Ridge 12.598571
1 Downtown 7.480117
2 Dunbar-Southlands 16.078115
3 Fairview 13.910821
4 Grandview-Woodland 12.603627
5 Hastings-Sunrise 12.185441
6 Kensington-Cedar Cottage 12.005600
7 Kerrisdale 13.904960
8 Killarney 10.030000
9 Kitsilano 15.080855
10 Marpole 12.419492
11 Mount Pleasant 13.401759
12 Oakridge 10.236263
13 Renfrew-Collingwood 10.308724
14 Riley Park 12.676829
15 Shaughnessy 14.162611
16 South Cambie 12.402542
17 Strathcona 12.447333
18 Sunset 11.147249
19 Victoria-Fraserview 10.456678
20 West End 12.842520
21 West Point Grey 13.256250
In [19]:
neighbourhood_mean_plot = alt.Chart(neighbourhood_mean).mark_circle(color='darkblue', size=75).encode(
    alt.X('neighbourhood_name', title='Neighbourhood'),
    alt.Y('diameter', title='Diameter'),
    tooltip=('neighbourhood_name', 'diameter'))
neighbourhood_mean_plot
Out[19]:

I'm going to include plots of both the mean tree diameters and the max tree diameters in my dashboard to get a more comprehensive view of the tree sizes.
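The mean and max tables can also be produced in a single pass with named aggregation. This is only a sketch on made-up rows: `sample_df`, `mean_diameter`, and `max_diameter` are illustrative names, not ones used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'neighbourhood_name': ['Sunset', 'Sunset', 'Downtown', 'Downtown'],
    'diameter': [10.0, 20.0, 4.0, 8.0],
})

# One groupby, two summary columns per neighbourhood.
diameter_summary = (
    sample_df.groupby('neighbourhood_name')['diameter']
    .agg(mean_diameter='mean', max_diameter='max')
    .reset_index()
)
print(diameter_summary)
```

Keeping both measures in one frame would let a single chart source feed both dashboard panels, although the cells below keep the two separate frames.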

Dashboard

In [21]:
neighbourhood_mean_plot = alt.Chart(neighbourhood_mean).mark_circle(color='orange', size=75).encode(
    alt.X('neighbourhood_name', title='Neighbourhood'),
    alt.Y('diameter', title='Diameter (in)'),
    tooltip=('neighbourhood_name', 'diameter'),
    opacity=alt.condition(select_neighbourhood_click, alt.value(0.9), alt.value(0.2))
).add_selection(select_neighbourhood_click).properties(title='Average Diameter of Trees in Each Neighbourhood')

neighbourhood_max_plot = alt.Chart(neighbourhood_max).mark_circle(color='maroon', size=75).encode(
    alt.X('neighbourhood_name', title='Neighbourhood'),
    alt.Y('diameter', title='Diameter (in)'),
    tooltip=('neighbourhood_name', 'diameter'),
    opacity=alt.condition(select_neighbourhood_click, alt.value(0.9), alt.value(0.2))
).add_selection(select_neighbourhood_click).properties(title='Largest Tree Diameter in Each Neighbourhood')
dashboard = (select_neighbourhood | genus_per_neighbourhood) & (neighbourhood_mean_plot | neighbourhood_max_plot)
dashboard
Out[21]:

Concluding Remarks

There is a lot to learn about the trees of Vancouver's streets! I've learned that, in this 5,000-tree sample, the number of trees varies widely between neighbourhoods: only 75 in Strathcona versus 384 in Renfrew-Collingwood.
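The smallest and largest neighbourhood counts can be pulled out without reading them off the bar chart. A minimal sketch, assuming a made-up frame (`sample_df` is hypothetical and its counts are not the real ones):

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'neighbourhood_name': ['Sunset'] * 3 + ['Downtown'] * 2 + ['Strathcona'],
})

# Trees per neighbourhood, then the labels of the smallest and largest counts.
trees_per_neighbourhood = sample_df['neighbourhood_name'].value_counts()
fewest = trees_per_neighbourhood.idxmin()
most = trees_per_neighbourhood.idxmax()
print(f"{fewest}: {trees_per_neighbourhood.min()} trees, "
      f"{most}: {trees_per_neighbourhood.max()} trees")
```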

Another observation is that Acer and Prunus are the two most abundant genera in every neighbourhood except Downtown, where the two most abundant are Acer and Fagus. Downtown also has the smallest average tree diameter and the smallest maximum tree diameter of all the neighbourhoods. A future route of inquiry could be whether Prunus trees tend to have large diameters, and whether the lack of Prunus trees Downtown contributes to the smaller average tree diameter in that neighbourhood.
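One way to start on the Prunus question would be to summarise diameter by genus. The sketch below uses made-up rows (`sample_df` is hypothetical), so its numbers say nothing about the real trees; it only shows the shape of the check.

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'genus_name': ['ACER', 'ACER', 'PRUNUS', 'PRUNUS', 'FAGUS'],
    'diameter': [12.0, 18.0, 9.0, 11.0, 5.0],
})

# Tree count plus mean and median diameter per genus, largest mean first.
genus_size = (
    sample_df.groupby('genus_name')['diameter']
    .agg(['count', 'mean', 'median'])
    .sort_values('mean', ascending=False)
)
print(genus_size)
```

Including the count alongside the mean makes it easier to spot genera whose average rests on only one or two trees.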

Interestingly, there doesn't appear to be a strong correlation between the number of trees in a neighbourhood and its average or maximum tree diameter. Renfrew-Collingwood, which has the most trees, has a mean diameter of about 10.3, below the overall mean of about 12.3, and a relatively small maximum diameter of 38.5. My initial thought was that the neighbourhood with the most trees would also be likely to have some large trees, but this isn't the case. Perhaps neighbourhoods with more trees tend to have smaller ones, so that more trees fit into a given area.
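The impression from the plots could be checked numerically by correlating tree count with the size measures, one row per neighbourhood. This is a sketch under stated assumptions: the frame and the neighbourhood labels 'A', 'B', 'C' are invented, and Pearson correlation is my choice rather than anything used above.

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'neighbourhood_name': ['A', 'A', 'A', 'B', 'B', 'C'],
    'diameter': [4.0, 6.0, 8.0, 10.0, 14.0, 20.0],
})

# One row per neighbourhood: how many trees, and how big they are.
per_neighbourhood = sample_df.groupby('neighbourhood_name')['diameter'].agg(
    tree_count='count', mean_diameter='mean', max_diameter='max'
)

# Pearson correlation of tree count with each size measure.
correlations = per_neighbourhood.corr()['tree_count'].drop('tree_count')
print(correlations)
```

With only 22 neighbourhoods in the real data, any coefficient from this kind of check would be a rough indication at best.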

In future inquiries, I would like to look further into the sizes of trees of different genera, and perhaps even into whether the species within a genus tend to vary widely in size.
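A possible starting point for the species question is to compare species-level mean diameters inside each genus. The sketch below assumes a made-up frame (`sample_df` is hypothetical) and uses the range of species means as the spread measure, which is just one option among several.

```python
import pandas as pd

# Hypothetical stand-in for trees_df; these rows are illustrative, not real data.
sample_df = pd.DataFrame({
    'genus_name': ['ACER', 'ACER', 'ACER', 'ACER', 'PRUNUS', 'PRUNUS'],
    'species_name': ['PLATANOIDES', 'PLATANOIDES', 'CAMPESTRE', 'CAMPESTRE',
                     'SERRULATA', 'SERRULATA'],
    'diameter': [20.0, 30.0, 4.0, 6.0, 10.0, 12.0],
})

# Mean diameter for every species within every genus.
species_mean = (
    sample_df.groupby(['genus_name', 'species_name'])['diameter']
    .mean()
    .reset_index(name='mean_diameter')
)

# Spread of those species means inside each genus (largest minus smallest).
genus_spread = species_mean.groupby('genus_name')['mean_diameter'].agg(
    lambda means: means.max() - means.min()
)
print(genus_spread)
```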

References

  • Tree image is from the District of North Vancouver website
  • Data source
  • Altair documentation including:
    • Filter
    • Interactive Charts
  • UBC Data Visualization slides